Transfer learning for cross-lingual automatic speech recognition
Abstract
In this study, an instance-based transfer learning approach to phoneme modeling is presented to mitigate the effects of limited data in a target language by using data from richly resourced source languages. A maximum likelihood (ML) learning criterion is introduced to learn the model parameters of a given phoneme class from both target-language and source-language data. Each phoneme was modeled using a 3-state, single-Gaussian HMM. Turkish and English were chosen as the target and source languages, respectively. With only 20 Turkish utterances, the monophone recognition accuracy of the transfer-learned HMMs in Turkish was close to the accuracy achieved by standard HMMs trained on 100 or more utterances from the Turkish training corpus.
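The abstract does not spell out the ML criterion, so the following is only a minimal sketch of one common way to pool target and source frames when estimating the single Gaussian of one HMM state: a weighted ML estimate in which source-language frames count less than target-language frames. The function name, the `source_weight` knob, the diagonal covariance, and the variance floor are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence, Tuple


def weighted_ml_gaussian(
    target_frames: Sequence[Sequence[float]],
    source_frames: Sequence[Sequence[float]],
    source_weight: float = 0.1,   # hypothetical down-weighting of source data
    var_floor: float = 1e-6,      # hypothetical floor to keep variances positive
) -> Tuple[List[float], List[float]]:
    """Weighted ML estimate of a diagonal Gaussian from pooled frames.

    Target frames get weight 1.0 and source frames get `source_weight`.
    This is an assumed scheme for illustration, not the paper's criterion.
    """
    # Pair every frame with its instance weight.
    weighted = [(1.0, f) for f in target_frames]
    weighted += [(source_weight, f) for f in source_frames]

    total = sum(w for w, _ in weighted)
    if total <= 0.0:
        raise ValueError("need at least one frame with positive weight")

    dim = len(weighted[0][1])

    # Weighted sample mean per feature dimension.
    mean = [sum(w * f[d] for w, f in weighted) / total for d in range(dim)]

    # Weighted (biased, i.e. ML) variance per dimension, floored.
    var = [
        max(sum(w * (f[d] - mean[d]) ** 2 for w, f in weighted) / total, var_floor)
        for d in range(dim)
    ]
    return mean, var
```

Setting `source_weight=0.0` reduces this to the ordinary target-only ML estimate, while `source_weight=1.0` treats source and target frames as one pooled data set; values in between trade off the two.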
Similar resources
Semi-Supervised and Cross-Lingual Knowledge Transfer Learnings for DNN Hybrid Acoustic Models Under Low-Resource Conditions
Semi-supervised and cross-lingual knowledge transfer learning are two strategies for boosting the performance of low-resource speech recognition systems. In this paper, we propose a unified knowledge transfer learning method to deal with these two learning tasks. This knowledge transfer is realized by fine-tuning a deep neural network (DNN). We demonstrate its effectiveness in both mono...
Cross-Lingual Approaches: The Basque Case
Cross-lingual speech recognition could be relevant for Multilingual Automatic Speech Recognition (ASR) systems that work with both under-resourced and well-resourced languages. In the Basque Country, the interest in Multilingual Automatic Speech Recognition systems comes from the fact that there are three official languages in use (Basque, Spanish, and French). Multilingual Basq...
Cross-lingual transfer learning during supervised training in low resource scenarios
In this study, transfer learning techniques are presented for cross-lingual speech recognition to mitigate the effects of limited availability of data in a target language using data from richly resourced source languages. First, a maximum likelihood (ML) based regularization criterion is used to learn context-dependent Gaussian mixture model (GMM) based hidden Markov model (HMM) parameters for...
Multi-lingual phoneme recognition exploiting acoustic-phonetic similarities of sounds
The aim of this work is to exploit the acoustic-phonetic similarities between several languages. In recent work, cross-language HMM-based phoneme models have been used only for bootstrapping the language-dependent models, and the multi-lingual approach has been investigated only on very small speech corpora. In this paper, we introduce a statistical distance measure to determine the similarities...
Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model
Phoneme-based multilingual training and different cross-lingual adaptation techniques for Automatic Speech Recognition (ASR) are explored in Connectionist Temporal Classification (CTC)-based systems. The multilingual model is trained to model a universal IPA-based phone set using the CTC loss function. While the same IPA symbol may not correspond to acoustic similarity, Learning Hidden Unit Contribu...